(54) Genetic engineering

(57) It has been a problem to find an alternative, less time-consuming, and more reliable source of factor IX, a polypeptide which is essential to the human blood-clotting process and necessary for the treatment of patients with Christmas disease. In order to aid in the solution of the problem, there is provided recombinant DNA containing a DNA sequence occurring in the human factor IX genome. This includes recombinant DNA comprising substantially the whole sequence of the human factor IX genome, which is inserted in a cloning vehicle and transformed into a host, such as Escherichia coli. Other fragments of the sequence have also been cloned and the invention includes DNA molecules comprising part or all of the human factor IX DNA. There is also described cDNA derived from human factor IX RNA. Uses include the provision of an intermediate of value in the genetic engineering of a factor IX polypeptide precursor and thence manufacture of the factor IX polypeptide, and in making probes for use in diagnosing the presence of normal or abnormal factor IX DNA in patients with Christmas disease.
1st amino acid sequence (residues 70-75):    Glu-Cys-Trp-Cys-Gln-Ala
mRNA:                                        5' GA(A/G) UG(U/C) UGG UG(U/C) CA(A/G) GCN 3'
Deoxyoligonucleotides synthesized:           3' CT(T/C) AC(A/G) ACC AC(A/G) GTT CG (oligo N2A)
                                             3' CT(T/C) AC(A/G) ACC AC(A/G) GTC CG (oligo N2B)

2nd amino acid sequence (residues 348-352):  His-Met-Phe-Cys-Ala
mRNA:                                        5' CA(U/C) AUG UU(U/C) UG(U/C) GCN
Deoxyoligonucleotides synthesized:           GT(A/G) TAC AA(A/G) AC(A/G) CG (oligo N1)
60
ESNPCLNGGMCKDDINSY
TGAATCCAATCCATGTTTAAATGGCGGCATGTGCAAGGATGACATTAATTCCTAT
10 20 30 40 50

70 80 90
ECWCQAGFEGTNCELDATCSIK
GAATGTTGGTGTCAAGCTGGATTTGAAGGAACGAACTGTGAATTAGATGCAACATGCAGCATTAA
60 70 80 90 100 110 120

100
NGRCKQFCKRDTDNKVVC
GAATGGCAGATGCAAGCAGTTTTGTAAAAGGGACACAGATAACAAGGTGGTTTGT
130 140 150 160 170

110 120 130
SCTDGYRLAEDQKSCEPAVPFP
TCCTGTACTGACGGATACCGACTTGCAGAAGACCAAAAGTCCTGCGAACCAGCAGTGCCATTTCC
180 190 200 210 220 230 240

140 150
CGRVSVSHfVRPRFHGLCSC^El
CTGTGGACGAGTCTCTGTCTCACATCTGAGGCCCCGCTTTCACX5GTCTGTGTTCGTGCTGAGAA 3'
250 260 270 280 290 300
FIG. 8(a)
FIG. 8(b)
FIG. 8(c)
FIG. 8(d)
FIG. 8(e)
FIG. 8(f)
FIG. 8(g)
Position   Fraction   Enzyme    Recognition sequence
6874       0.579      HINF1     GATTC
6911       0.582      ECOR1     GAATTC
6916       0.582      HPA11     CCGG
6934       0.584      ALU1      AGCT
6991       0.589      HINF1     GACTC
7028       0.592      SAU961    GGGCC
7029       0.592      HAE111    GGCC
7038       0.593      DDE1      CTCAG
7052       0.594      FOK1      GGATG
7056       0.594      SAU961    GGGCC
7057       0.594      HAE111    GGCC
7059       0.594      MNL1      CCTC
7124       0.600      MBO11     TCTTC
7155       0.603      MBO11     GAAGA
7155       0.603      XMN1      GAAGAGTTTC
7179       0.605      DDE1      CTAAG
7182       0.605      ALU1      AGCT
7185       0.605      HPH1      TCACC
7194       0.606      DDE1      CTGAG
7196       0.606      MNL1      GAGG
7237       0.609      ALU1      AGCT
7293       0.614      AVA1      CTCGGG
7310       0.616      MBO11     GAAGA
7313       0.616      SFNA1     GATGC
7322       0.617      BSTN1     CCAGG
7322       0.617      SCRF1     CCAGG
7343       0.618      RSA1      GTAC
7373       0.621      HGIA1     GAGCTC
7373       0.621      SAC1      GAGCTC
7374       0.621      ALU1      AGCT
7376       0.621      DDE1      CTCAG
7378       0.621      PVU11     CAGCTG
7379       0.621      ALU1      AGCT
7394       0.623      HAE111    GGCC
7396       0.623      BSTN1     CCAGG
7396       0.623      SCRF1     CCAGG
7408       0.624      DDE1      CTGAG
7410       0.624      MNL1      GAGG
7438       0.626      FOK1      GGATG
7485       0.630      STU1      AGGCCT
7486       0.630      HAE111    GGCC
7488       0.631      MNL1      CCTC
7507       0.632      HPH1      GGTGA
7516       0.633      MNL1      GAGG
7529       0.634      ALU1      AGCT
7547       0.636      MBO11     GAAGA
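
The decimal column of the chart appears to track each site position divided by the length of the sequenced region (11,873 nucleotides, per Figure 6(b)), to within about 0.001. The following is a minimal Python sketch of that arithmetic; the function name and the rounding to three places are illustrative assumptions and are not taken from the original chart.

```python
SEQUENCED_LENGTH = 11873  # nucleotides sequenced, as stated for Figure 6(b)


def fractional_position(position, total=SEQUENCED_LENGTH):
    """Return a site position as a fraction of the sequenced length.

    Rounding to three decimal places is an assumption made to mimic
    the layout of the chart.
    """
    return round(position / total, 3)


# A few rows from the chart, given as (enzyme, position).
for enzyme, position in [("HINF1", 6874), ("ECOR1", 6911), ("MBO11", 7124)]:
    print(enzyme, position, fractional_position(position))
```

Some rows of the chart differ from this calculation in the last digit, so the original program may have rounded differently or used a slightly different reference point.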



GB 2125409 A    24/35



FIG. 8(i) 
FIG. 8(j)
FIG. 8(k)
FIG. 8(l)
SPECIFICATION
Genetic engineering

BACKGROUND OF THE INVENTION

1. Field of the invention

This invention is in the field of genetic engineering relating to factor IX DNA.

2. Description of prior art

Factor IX (Christmas factor or antihaemophilic factor B) is the zymogen of a serine protease which is required for blood coagulation via the intrinsic pathway of clotting (Jackson & Nemerson, Ann.Rev.Biochem. 49, 765-811, 1980). This factor is synthesised in the liver and requires vitamin K for its biosynthesis (Di Scipio & Davie, Biochem. 18, 899-904, 1979).

Human factor IX has been purified and characterised, but details of the amino acid sequence are fragmentary. It is a single-chain glycoprotein, with a molecular weight of approximately 60,000 (Suomela, Eur.J.Biochem. 71, 145-154, 1976). Like other vitamin K-dependent plasma proteins, human factor IX contains in the amino-terminal region approximately 12 gamma-carboxyglutamic acid residues (Di Scipio & Davie, Biochem. 18, 899-904, 1979).

During the clotting process, and in the presence of Ca++ ions, factor IX is acted upon by activated factor XI (XIa), which cleaves two internal peptide bonds, releasing an activation glycopeptide of 10,000 daltons (Di Scipio et al., J.Clin.Invest. 61, 1528-1538, 1978). The activated factor IX (IXa) is composed of two chains held together by at least one disulphide bond. Factor IXa then participates in the next step in the coagulation cascade by acting on factor X in the presence of activated factor VIII, Ca++ ions, and phospholipids (Lindquist et al., J.Biol.Chem. 255, 1902-1909, 1978).

Individuals deficient in factor IX (Christmas disease or haemophilia B) show bleeding symptoms which persist throughout life. Bleeding may occur spontaneously or following injury, and may take place virtually anywhere. Bleeding into the joints is common and, after repeated haemorrhages, may result in permanent and crippling deformities. The condition is a sex-linked disorder affecting males. Its frequency in the population is approximately 1 in 30,000 males.

The current method of diagnosing Christmas disease involves measurement of the titre of factor IX in plasma by a combination of a clotting assay and an immunochemical assay. Treatment of haemorrhage in the disease consists of factor IX replacement by means of intravenous transfusion of human plasma protein concentrates enriched in factor IX. The enrichment of plasma in factor IX is a time-consuming process.

Summary of the invention

After considerable research and experiment, important progress has now been made towards producing artificial human factor IX by recombinant DNA technology (genetic engineering). Thus, the cloning of DNA sequences which are substantially the same as extensive sequences occurring in the human factor IX genome has been achieved.

The invention arises from the finding that an extensive DNA sequence of the human factor IX genome can be obtained by a clever and laborious combination of chemical synthesis and artificial biosynthesis, starting from elementary nucleotide or dinucleotide "building blocks", as will be described below.

A major feature of the invention comprises recombinant DNA which comprises a cloning vehicle DNA sequence and a sequence foreign thereto (i.e. foreign to the vehicle) which is substantially the same as a sequence occurring in the human factor IX genome. An 11,873 nucleotide long part of such a foreign sequence has been identified and a very large part of it has been sequenced by the Maxam-Gilbert sequencing method. A 129 nucleotide length of this sequence is more than sufficient to characterise it unambiguously as coding for a specific protein, and a particular such length is regarded herein as useful to characterise the whole sequence inserted in the cloning vehicle as one occurring in the human factor IX genome. Other cloned sequences can then be verified as belonging to the human factor IX genome by determining that part thereof is identical to a region of the first-mentioned sequence, i.e. the sequences have a common identity in an overlapping region.
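
The overlap test described above can be illustrated with a short Python sketch. It reports the longest stretch in which the end of one cloned sequence is identical to the start of another. The function name, the `min_len` threshold and the example strings are hypothetical and chosen only for illustration; the specification does not prescribe any particular procedure or minimum overlap length.

```python
def longest_overlap(first, second, min_len=20):
    """Length of the longest suffix of `first` that equals a prefix of `second`.

    Returns 0 when no overlap of at least `min_len` nucleotides exists.
    `min_len` is an illustrative threshold, not a value from the specification.
    """
    limit = min(len(first), len(second))
    for size in range(limit, min_len - 1, -1):
        if first[-size:] == second[:size]:
            return size
    return 0


# Hypothetical short fragments sharing the six-nucleotide stretch GGGTTT.
clone_a = "AAACCCGGGTTT"
clone_b = "GGGTTTACGT"
print(longest_overlap(clone_a, clone_b, min_len=4))
```

In practice a much larger `min_len` would be needed before an identical overlap could be taken as evidence that two clones derive from the same region of the genome.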

A further feature of the invention therefore comprises recombinant DNA which comprises a cloning vehicle or vector DNA sequence and a DNA sequence foreign thereto which consists of or includes substantially the following sequence of 129 nucleotides (which should be read in rows of 30 across the page):

ATGTAACATG TAACATTAAG AATGGCAGAT
GCGAGCAGTT TTGTAAAAAT AGTGCTGATA
ACAAGGTGGT TTGCTCCTGT ACTGAGGGAT
ATCGACTTGC AGAAAACCAG AAGTCCTGTG
AACCAGCAG (1)



The invention includes particularly recombinant DNA which comprises a cloning vehicle DNA sequence and a sequence foreign to the cloning vehicle, wherein the foreign sequence includes substantially the whole of an exon sequence of the human factor IX genome. The 129-nucleotide sequence described above corresponds substantially to such an exon sequence. Another such exon sequence which independently characterises the human factor IX DNA is the 203-nucleotide sequence substantially as follows (again reading in rows of 30 across the page):
TGCCATTTCC ATGTGGAAGA GTTTCTGTTT
CACAAACTTC TAAGCTCACC CGTGCTGAGG
CTGTTTTTCC TGATGTGGAC TATGTAAATT
CTACTGAAGC TGAAACCATT TTGGATAACA
TCACTCAAAG CACCCAATCA TTTAATGACT
TCACTCGGGT TGTTGGTGGA GAAGATGCCA
AACCAGGTCA ATTCCCTTGG CAG

The intron sequences of the human factor IX genome are excised during the transcription process by which mRNA is made in human cells. Only exon sequences are translated into protein. DNA coding for factor IX has been prepared from human mRNA. This cDNA has been partly sequenced and found to contain the same 129- and 203-nucleotide sequences set out above.
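
As an illustration of how an exon sequence is read as protein, the following Python sketch translates the 129-nucleotide sequence using the standard genetic code. The reading frame (starting at the third nucleotide) is an assumption made here because it yields an uninterrupted open reading frame containing an NGRC...QFCK motif like the one in the bovine sequence shown earlier; the specification itself does not state the frame at this point. The function and variable names are hypothetical.

```python
from itertools import product

BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
# Standard genetic code, built in TCAG order for all three codon positions.
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# The 129-nucleotide exon sequence, rows of 30 as printed.
EXON_129 = (
    "ATGTAACATGTAACATTAAGAATGGCAGAT"
    "GCGAGCAGTTTTGTAAAAATAGTGCTGATA"
    "ACAAGGTGGTTTGCTCCTGTACTGAGGGAT"
    "ATCGACTTGCAGAAAACCAGAAGTCCTGTG"
    "AACCAGCAG"
)


def translate(dna, frame=0):
    """Translate complete codons of `dna` starting at 0-based offset `frame`."""
    return "".join(
        CODON_TABLE[dna[i:i + 3]] for i in range(frame, len(dna) - 2, 3)
    )


# Frame 2 is an assumed reading frame (see the note above).
print(len(EXON_129), translate(EXON_129, frame=2))
```

The other two frames contain stop codons (for example TAA near the start), which is consistent with the assumed frame being the coding one.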

The invention also includes recombinant DNA which comprises a cloning vehicle sequence and a DNA sequence foreign to the cloning vehicle, wherein the foreign sequence comprises a DNA sequence which is complementary to human factor IX mRNA. Such a recombinant cDNA can be isolated from a library of recombinant cDNA clones derived from human liver mRNA by using an exon of the genomic human factor IX DNA (or part thereof) as a probe to screen this library and thence isolating the resulting clones.

The invention also includes recombinant DNA in which the foreign sequence is any fragment of human factor IX DNA, particularly of length at least 50 and preferably at least 75 nucleotides or base-pairs. It includes such recombinant DNA whether or not the fragment is part of the 129- or 203-base-pair sequence defined above. It includes especially part or all of the exon sequences of human factor IX genomic DNA. Various short lengths up to about 11 kilobases (11,000 nucleotides or base-pairs) long have been prepared by use of various restriction endonucleases. Methods of isolating recombinant DNA from clones are well known and some are described hereinafter. The DNA of the invention can be in single- or double-stranded form.

The recombinant human factor IX DNA of this invention is useful as a tool of recombinant DNA technology. Thus it is useful as the first stage in the production of artificial human factor IX and in the preparation of probes for diagnostic purposes.

In the production of the artificial human factor IX it is contemplated that appropriate cDNA or genomic clones will be introduced into a suitable expression vector in either mammalian or bacterial systems. For mammalian studies, the gene might be too long to be conveniently retained in one clone. Therefore a suitable artificial "minigene" will be designed and constructed from suitable parts of the cDNA and genomic clones. The minigene will be under the control of its own promoter, or that promoter will instead be replaced by an artificial one, perhaps the mouse metallothionein I promoter. The resultant "minigene" will then be introduced into mammalian tissue culture cells, e.g. a hepatoma cell line, and selection for clones of cells synthesising maximum amounts of biologically active factor IX will be carried out. Alternatively "genetic farming" could be employed, as has been demonstrated for mouse growth hormone (Palmiter et al., Nature 300, 611-615, 1982). The minigene would be micro-injected into the pronucleus of fertilised eggs, followed by in vivo cloning and selection for progeny producing the largest quantity of human factor IX in blood. Alternatively, it is contemplated that the cDNA clone or selected parts of it will be linked to a suitable strong bacterial promoter, e.g. a Lac or Trp promoter or a lambda promoter, and a factor IX polypeptide obtained therefrom.

The natural factor IX polypeptide is synthesised as a precursor containing both a signal and a propeptide region. Both are normally cleaved off in the production of the definitive length protein. Even this product is merely a precursor. It is biologically inactive and must be gamma-carboxylated at 12 specific N-terminal glutamic acid residues in the so-called "GLA" domain by the action of a specific vitamin K-dependent carboxylase. In addition, two carbohydrate molecules are added to the connecting peptide region of the molecule, but it remains unknown whether they are required for activity. The substrate for the carboxylase is unknown and could be the precursor factor IX polypeptide or alternatively the definitive length protein. Therefore various relevant polypeptides both with and without the precursor domains will be "constructed" using genetic engineering methods in bacterial hosts. They will then be tested as substrates for the conversion of inactive to biologically active factor IX in vitro by the action of partially purified preparations of the carboxylase enzyme, which can be isolated from liver microsomes or other suitable sources.

For diagnostic purposes, the recombinant human genomic factor IX DNA or recombinant human mRNA-derived factor IX DNA has a wide variety of uses. It can be cleaved by enzymes or combinations of two or more enzymes into shorter fragments of DNA which can be recombined into the cloning vehicle, producing "sub-clones". These sub-clones can themselves be cleaved by restriction enzymes to DNA molecules suitable for preparing probes. A probe DNA (by definition) is labelled in some way, conveniently radiolabelled, and can be used to examine in detail mutations in the human DNA which ordinarily would produce factor IX. Several different probes have been produced for examining several different regions of the genome where mutation was suspected to have occurred in patients. Failure to obtain hybridisation from such a probe indicates that the sequence of the probe differs in the patient's DNA. In particular it has been shown that Christmas disease can be detected or confirmed by such methodology. Useful probes can contain intron and/or exon regions of the genomic DNA or can contain cDNA derived from the mRNA.

The invention includes particularly probe DNA, i.e. DNA which is labelled, and of a length suitable for the probing use envisaged. It can be single-stranded or double-stranded over at least the human factor IX DNA probing sequences thereof, and such sequences will usually have a length of at least 15 nucleotides, preferably at least 19-30 nucleotides, in order to have a reasonable probability of being unique. They will not usually be longer than 5 kb and rarely longer than 10 kb.
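
The link between probe length and uniqueness can be made concrete with a rough calculation. The Python sketch below estimates how many times a given sequence of n nucleotides would be expected to occur by chance in a genome, under two simplifying assumptions that are not taken from the specification: the genome is treated as random sequence with equal base frequencies, and its size is taken as about 3,000,000,000 base-pairs, with both strands counted.

```python
def expected_chance_matches(probe_length, genome_size=3_000_000_000):
    """Expected number of chance occurrences of one specific n-mer.

    Assumes a random genome with equal base frequencies and counts both
    strands; `genome_size` is an assumed round figure for a human genome.
    """
    return 2 * genome_size / 4 ** probe_length


for n in (15, 19, 30):
    print(n, expected_chance_matches(n))
```

Under these assumptions a 15-nucleotide sequence is expected to occur several times by chance, whereas a sequence of 19 or more nucleotides is expected to occur far less than once, which is in line with the preferred range given above.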

The invention accordingly includes a DNA molecule comprising part of the human factor IX DNA sequence, whether or not labelled, whether intron or exon or partly both. It also includes human cDNA corresponding to part or all of human factor IX mRNA. It includes particularly a solution of any DNA of the invention, which is a form in which it is conveniently obtainable by electroelution from a gel.

The invention includes, of course, a host transformed with any of the recombinant DNA of the invention. The host can be a bacterium, for example an appropriate strain of E. coli, chosen according to the nature of the cloning vehicle employed. Useful hosts may include strains of Pseudomonas, Bacillus subtilis and Bacillus stearothermophilus, other Bacilli, yeasts and other fungi and mammalian (including human) cells.

One process practised in connection with this invention for preparing a host transformed with the recombinant DNA of the invention is based on the following steps:

(1) synthesising an oligodeoxynucleotide having a nucleotide sequence comprising that occurring in bovine factor IX messenger RNA coding for amino acids 70-75 or 348-352 of bovine factor IX, and labelling the oligodeoxynucleotide to form a probe;

(2) preparing complementary DNA to a mixture of bovine mRNAs;

(3) inserting the complementary DNA in a cloning vector to form a mixture of recombinant bovine cDNAs;

(4) transforming a host with said mixture of recombinant bovine cDNAs to form a library of clones and multiplying said clones;

(5) probing the clones with the synthetic oligodeoxynucleotide probe obtained in step 1 and isolating the resultant recombinant bovine factor IX cDNA-containing clone;

(6) digesting the recombinant bovine factor IX cDNA from said clone with one or more enzymes to produce a bovine factor IX cDNA molecule comprising a shorter sequence of bovine factor IX DNA, but preferably at least 50 base-pairs long; and

(7) probing a library of recombinant human genomic DNA in a transformed host with the shorter sequence bovine factor IX cDNA molecule, to hybridise the human genomic DNA to the said recombinant bovine factor IX DNA, and isolating the resultant recombinant DNA-transformed host.



Brief description of the drawings

Figure 1 shows the structure of a published amino-acid sequence of bovine factor IX polypeptide, the deduced sequence of the mRNA from which it would be translated and the structures of oligonucleotides (oligo N1 and N2) synthesised in the course of this invention;

Figures 2 and 3 show the chemical formulae of "building blocks" used to synthesise the oligonucleotides referred to in Figures 1 and 11;

Figure 4 is an elevational view, partly sectioned, showing an apparatus for synthesising oligonucleotides;

Figure 5 shows the sequence of part of the bovine factor IX cDNA obtained in this invention;

Figure 6 is a map showing the organisation of an approximately 27 kb length of human factor IX genomic DNA and is divided into five portions, showing:

(a) the exon regions;

(b) the 11,873-nucleotide length sequenced;

(c) cDNA molecules obtained by restriction with various endonucleases, sub-cloned and subsequently used as probes;

(d) DNA molecules obtained by restriction with various endonucleases; and

(e) three regions of human factor IX genomic DNA derived from three clones in lambda phage vector.

Figure 7 shows the sequence of the DNA of Figure 6(b) and in parts the encoded protein;

Figure 8 shows a restriction enzyme chart of the sequence shown in Figure 7;

Figure 9 shows part of the sequence of the human factor IX cDNA and its encoded protein;

Figure 10 shows the structure of a pair of complementary oligonucleotides (oligo N3 and N4) synthesised in the course of this invention;

Figure 11 shows part of the DNA sequence of the vector pAT153/PvuII/8 of this invention, in the region where it differs from pAT153;

Figure 12 is a diagram of plasmid pHIX17 of the invention showing the origin of the 1.4 kb fragment used for probing and initial sequencing; and

Figure 13 shows the position of the major radioactive bands on probing a "Southern blot" of normal human DNA, cut by the restriction enzymes EcoRI (E), HindIII (H), BglII (B) and BclI (Bc), with a sub-clone of the recombinant human factor IX DNA of this invention.

DESCRIPTION OF PREFERRED EMBODIMENTS

1. General description

A recombinant DNA of the invention can be extracted by means of probes from a library of cloned human genomic DNA. This is a known recombinant library and the invention does not, of course, extend to human genomic factor IX DNA when present in such a library. The probes used were of bovine factor IX cDNA (DNA complementary to bovine mRNA), which were prepared by an elaborate process involving firstly the preparation of recombinant bovine cDNA from a bovine mRNA starting material, secondly the chemical synthesis of oligonucleotides, thirdly their use to probe the recombinant bovine cDNA in order to extract bovine factor IX cDNA, and fourthly the preparation of suitable probes of shorter length from the recombinant bovine factor IX cDNA. The first probe tried appeared to contain an irrelevant sequence, and the second probe tried, not containing it, proved successful in enabling a single clone of the human genomic factor IX DNA to be isolated. This clone is designated lambda HIX-1. The steps involved are described in more detail in the sub-section "Examples" appearing hereinafter, and the second probe comprises the 247 base-pair DNA sequence of bovine factor IX cDNA indicated in Figure 5 of the drawings. The invention therefore provides specifically a recombinant DNA which comprises a cloning vehicle sequence and a DNA sequence foreign to the cloning vehicle, which recombinant DNA hybridises to a 247 base-pair sequence of bovine factor IX cDNA indicated in Figure 5 (by the arrows at each end thereof).

The cloning vehicle or vector employed in the invention can be any of those known in the genetic engineering art (but will be chosen to be compatible with the host). They include E. coli plasmids, e.g. pBR322, pAT153 and modifications thereof, plasmids with wider host ranges, e.g. RP4, plasmids specific to other bacterial hosts, phages, especially lambda phage, and cosmids. A cosmid is a cloning vehicle containing a fragment of phage DNA including its cos (cohesive-end site) inserted in a plasmid. The resultant recombinant DNA is circular and has the capacity to accommodate very large fragments of additional foreign DNA.
Fragments of human factor IX genomic DNA can be prepared by digesting the cloned DNA with various restriction enzymes. If desired, the fragments can be religated to a cloning vehicle to prepare further recombinant DNA and thereby obtain "sub-clones". In connection with this embodiment a new cloning vehicle has been prepared. This is a modified pAT153 plasmid prepared by ligating a BamHI and HindIII double digest of pAT153 to a pair of complementary double sticky-ended oligonucleotides having a DNA sequence providing a BamHI restriction residue at one end, a HindIII restriction residue at the other end and a PvuII restriction site in between.
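
Locating restriction sites in a known sequence, as is done when planning digests and sub-cloning, can be sketched in Python as follows. The recognition sequences used (BamHI GGATCC, HindIII AAGCTT, PvuII CAGCTG, HinfI GANTC) are the standard ones for these enzymes. The function names and the example sequence are hypothetical; the example is not the sequence of the oligonucleotide pair or vector of the invention.

```python
import re

# IUPAC nucleotide codes expanded to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "W": "[AT]", "S": "[CG]",
    "M": "[AC]", "K": "[GT]", "N": "[ACGT]",
}


def find_sites(sequence, site):
    """Return 1-based start positions of every match of `site` in `sequence`.

    Only the given strand is scanned, which is sufficient for palindromic
    sites; a non-palindromic site would also need its reverse complement.
    """
    pattern = "".join(IUPAC[base] for base in site.upper())
    # The lookahead allows overlapping occurrences to be reported.
    return [m.start() + 1 for m in re.finditer(f"(?={pattern})", sequence.upper())]


def site_map(sequence, enzymes):
    """Map each enzyme name to the list of positions of its site."""
    return {name: find_sites(sequence, site) for name, site in enzymes.items()}


ENZYMES = {"BamHI": "GGATCC", "HindIII": "AAGCTT", "PvuII": "CAGCTG", "HinfI": "GANTC"}

# Hypothetical test sequence containing one site each for three of the enzymes.
EXAMPLE = "AAGGATCCTTCAGCTGTTAAGCTTGG"
print(site_map(EXAMPLE, ENZYMES))
```

Dividing each reported position by the sequence length gives a fractional map position of the kind listed in a restriction enzyme chart.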

While the invention is described herein with reference to human genomic factor IX DNA in particular, the invention includes human factor IX cDNA (complementary to human factor IX mRNA) which contains substantially the same sequences. A library of human cDNA has been prepared and probed with human factor IX genomic DNA to isolate human factor IX cDNA from the library. For this purpose the probe DNA is conveniently of relatively short length and must include at least one exon sequence. The invention therefore includes a process of preparing a host transformed with recombinant DNA comprising cloning vector sequences and a sequence of nucleotides comprised in cDNA complementary to human factor IX mRNA, which process comprises probing a library of clones containing recombinant DNA complementary to human mRNA with a probe comprising a labelled DNA comprising a sequence complementary to part or all of an exon region of the human factor IX genome.

2. Examples

A. Bacteria used

E. coli K-12 strain MC1061 (Casadaban & Cohen, J.Mol.Biol. 138, 179-207, 1980), E. coli K-12 strain HB101 (Boyer & Roulland-Dussoix, J.Mol.Biol. 41, 459-472, 1969) and E. coli K-12 strain K803, which is a known strain used by genetic engineers.

B. Source and purification of bovine factor IX, anti-bovine factor IX antibody, and bovine mRNA

Highly purified bovine factor IX and rabbit anti-bovine factor IX antiserum were gifts from Dr. M. P. Esnouf. Analysis of the purified bovine factor IX on a denaturing polyacrylamide gel showed that it had a purity of greater than 99%. Specific anti-factor IX immunoglobulins used for immunoprecipitation experiments were purified as described by Choo et al., Biochem.J. 199, 527-535, 1981, by passage of the crude antiserum through a Sepharose-4B column onto which pure bovine factor IX had been coupled.

Bovine mRNA was obtained from calf liver and isolated by the guanidine hydrochloride method (Chirgwin et al., Biochem. 18, 5294-5299, 1979). The mRNA preparation was passed through an oligo dT-cellulose column (Caton and Robertson, Nucl.Acids Res. 7, 1445-1456, 1979) to isolate poly(A)+ mRNA. Poly(A)+ mRNA was translated in a rabbit reticulocyte cell-free system in the presence of 35S-cysteine as described by Pelham and Jackson (Eur.J.Biochem. 67, 247-256, 1976). At the end of the translation reaction, factor IX polypeptide was precipitated by the addition of specific anti-factor IX immunoglobulins. The immunoprecipitation procedure was as described by Choo et al., Biochem.J. 181, 285-294, 1979. The immunoprecipitated material was washed thoroughly and resolved on a two-dimensional SDS-polyacrylamide gel (Choo et al., Biochem.J. 181, 285-294, 1979), by isoelectric focussing in one dimension and electrophoresis in the other. Some polypeptides of known molecular weight were subjected to this procedure, to serve as reference points. The immunoprecipitated material showed 4 pronounced spots, all in the 50,000 molecular weight region and with separated isoelectric points. These predominant spots of molecular weight about 50,000 represent a single polypeptide chain plus a possible prepeptide signal sequence, a deduction compatible with published data (Katayama et al., Proc.Natl.Acad.Sci. USA 76, 4990-4994, 1979).

When the gel analysis was repeated for the same material but immunoprecipitated in the presence of unlabelled pure bovine factor IX, the 4 spots appeared at reduced intensity, indicating that the translation product is specifically competed for by pure factor IX. Thirdly, immunoprecipitation was performed using a control rabbit antiserum, i.e. from a rabbit which had not been immunised with factor IX. None of the 4 spots appeared. These results therefore indicate that the translation product was a factor IX polypeptide.

The specific immunological/cell-free translation assay established above was used to monitor the enrichment of factor IX mRNA on sucrose gradient centrifugations. Total poly(A)+ mRNA was resolved by two successive separations by sucrose gradient centrifugation. When individual fractions from the gradient were assayed by the above method, a fraction in the 20-22 Svedberg unit (approx. 2.5 kilobases of RNA) region was found to be enriched (approx. ten-fold) for the bovine factor IX mRNA. This enriched fraction was used in the subsequent cloning experiments.

C. Synthesis of specific bovine factor IX deoxyoligonucleotide mixtures

Starting from a knowledge of the amino acid sequence of bovine factor IX (Katayama et al., Proc.Natl.Acad.Sci. USA 76, 4990-4994, 1979), the synthesis of two mixtures of oligonucleotide probes was designed. These probes consisted of DNA sequences coding for two different regions of the protein. The regions selected were those known to differ in sequence in the analogous serine proteases, prothrombin, Factor C and Factors VII and X, and were those corresponding to amino acids 70-75 and 348-352 respectively. The 70-75 region was particularly favourable in that the mixture of oligonucleotides synthesised, i.e. oligo N2A and oligo N2B, contained all 16 possible sequences that might occur in a 17 nucleotide long region of the mRNA corresponding to amino acids 70-75. The oligo N2A-N2B mixture is hereinafter called "oligo N2" for brevity.

Figure 1 of the drawings shows the two selected regions of the known amino acid sequence of bovine factor IX, the corresponding mRNA and the oligonucleotides synthesised.

Since some of the amino acids are coded for by more than one nucleotide triplet, there are 4 ambiguities in the mRNA sequence shown for amino acids 70-75 and therefore 16 possible individual sequences.
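
The count of 16 follows from four two-fold ambiguities (2 x 2 x 2 x 2). As a sketch of this reasoning, the Python code below enumerates every 17-nucleotide mRNA sequence consistent with amino acids 70-75 (Glu-Cys-Trp-Cys-Gln-Ala) and derives the complementary DNA probe for each. The IUPAC-style letters R (A or G) and Y (U or C) are a notation adopted here for convenience, and the function names are hypothetical.

```python
from itertools import product

# Ambiguity codes for RNA: R = purine, Y = pyrimidine, N = any base.
AMBIGUITY = {"R": "AG", "Y": "UC", "N": "ACGU"}

RNA_TO_DNA_COMPLEMENT = {"A": "T", "U": "A", "G": "C", "C": "G"}


def expand(degenerate):
    """List every concrete sequence matching a degenerate RNA sequence."""
    choices = [AMBIGUITY.get(base, base) for base in degenerate]
    return ["".join(combo) for combo in product(*choices)]


def complementary_dna(mrna):
    """Base-by-base DNA complement, written 3'->5' alongside the 5'->3' mRNA."""
    return "".join(RNA_TO_DNA_COMPLEMENT[base] for base in mrna)


# First 17 nucleotides of the mRNA for Glu-Cys-Trp-Cys-Gln-Ala:
# GA(A/G) UG(U/C) UGG UG(U/C) CA(A/G) GC
MRNA_70_75 = "GARUGYUGGUGYCARGC"

mrnas = expand(MRNA_70_75)
probes = [complementary_dna(m) for m in mrnas]
print(len(probes), "probe sequences of length", len(probes[0]))
```

Half of the resulting probes end in GTT CG and half in GTC CG, corresponding to the two sub-mixtures oligo N2A and oligo N2B.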

The nucleotide mixtures oligo N1 and oligo N2 were synthesized using the solid phase phosphotriester method of Duckworth et al., Nucl.Acids Res. 9, 1691-1706, 1981, modified in two ways. Firstly, o-chlorophenyl rather than p-chlorophenyl blocking groups were used for the phosphotriester grouping, and were incorporated in the mononucleotide and dinucleotide "building blocks". Figures 2 and 3 of the drawings show (a) dinucleotide and (b) mononucleotide "building blocks". DMT = 4,4'-dimethoxytrityl and B = 6-N-benzoyladenin-9-yl, 4-N-benzoylcytosin-1-yl, 2-N-isobutyrylguanin-9-yl or thymin-1-yl, depending on the nucleotide selected. Secondly, the "reaction cell" used for the successive addition of mono- or dinucleotide "building blocks" was miniaturised so that the coupling step with the condensing agent 1-(mesitylene-2-sulphonyl)-3-nitro-1,2,4-triazole (MSNT) was carried out in a volume of 0.5 ml pyridine containing 3.5 micromoles of polydimethylacrylamide resin, 17.5 micromoles of incoming dinucleotide (or 35 micromoles of mononucleotide) and 210 micromoles of MSNT.
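
The quantities quoted for the coupling step imply fixed molar excesses of each reagent over the resin. The short Python sketch below works these out, together with the nominal concentrations in the 0.5 ml reaction volume. It assumes that the 3.5 micromoles quoted for the resin refers to its reactive sites; the variable and function names are illustrative only.

```python
RESIN_UMOL = 3.5        # micromoles of resin (assumed to mean reactive sites)
VOLUME_ML = 0.5         # reaction volume of pyridine

REAGENTS_UMOL = {
    "dinucleotide": 17.5,
    "mononucleotide": 35.0,
    "MSNT": 210.0,
}


def molar_excess(amount_umol, resin_umol=RESIN_UMOL):
    """Ratio of reagent to resin, in mole equivalents."""
    return amount_umol / resin_umol


def concentration_mM(amount_umol, volume_ml=VOLUME_ML):
    """Nominal concentration; micromoles per millilitre equals millimolar."""
    return amount_umol / volume_ml


for name, amount in REAGENTS_UMOL.items():
    print(name, molar_excess(amount), "equivalents,", concentration_mM(amount), "mM")
```

On these figures the dinucleotide is used in 5-fold excess, the mononucleotide in 10-fold excess and the condensing agent in 60-fold excess over the resin.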

Figure 4 of the drawings is an elevational view of the microreaction cell 1 and stopper 2 used for oligonucleotide synthesis, drawn 70% of actual size. The device comprises a glass-to-PTFE tubing joint 3 at the inlet end of the stopper 2. The stopper has an internal conduit 4 which at its lower end passes into a hollow tapered ground glass male member 5 and thence into a sintered glass outlet 6 to the stopper. The cell 1 has a ground glass female member 7 complementary to the member 5 of the stopper, leading to reaction chamber 8, the lower end of which terminates in a sintered glass outlet 9. This communicates with glass tubing 10 and a 1.2 mm "Interflow" tap 11. Further glass tubing 10, beyond the tap 11, leads to the outlet glass-to-PTFE tubing joint 12. Pairs of ears 13 on the stopper and cell enable them to be joined together by springs (not shown) in a liquid-tight manner.

After completion of the synthesis and
deprotection, fractionation was carried out by high
pressure liquid chromatography (Duckworth et al.,
see above) and the peak tubes corresponding to
the product of correct chain length were located
by labelling of fractions at their 5'-hydroxyl ends
using [gamma-32P]-ATP and T4 polynucleotide
kinase, followed by 20% 7M urea polyacrylamide
gel electrophoresis. The position on the gel of the
17- and 14-nucleotide oligonucleotides was
determined by separately labelling, by the method
described above, 17- and 14-nucleotide long
"marker" oligonucleotides and subjecting these to
the same gel electrophoresis.

D. Preparation of libraries of cDNA sequences for 
bovine mRNA 

Two different approaches were used for the
generation of cloned cDNA libraries:

(i) MboI library. First strand cDNA was
synthesised using the sucrose gradient-enriched
poly(A)+ bovine mRNA as template. The
conditions used were as described by Huddleston
& Brownlee, Nucl. Acids Res. 10, 1029-1038,
1981, except that 2 micrograms of oligo N-1,
20-30 micrograms of the mRNA, 10 microcuries
[alpha-32P]-dATP (Amersham, 3000 Ci/mmole),
and 50 U of reverse transcriptase were used in a
50 microlitre reaction. "dNTP" in Figure 1 denotes
the mixture of 4 deoxynucleoside triphosphates
required for synthesis. Oligo N-1 hybridises to
the corresponding region on the mRNA (refer to
Figure 1) and thereby acts as a primer for the
initiation of transcription. It was used in order to
achieve a further enrichment for factor IX mRNA.
At the end of the cDNA synthesis reaction, the
cDNA was extracted with phenol and desalted on
a Sephadex G-100 column, before it was treated
with alkali (0.1 M NaOH, 1 mM EDTA) for 30 min at
60°C to remove the mRNA strand. Second strand
DNA synthesis was then carried out exactly as
published (Huddleston & Brownlee, Nucl. Acids
Res. 10, 1029-1038, 1981).

The double-stranded DNA was next cleaved
with the restriction enzyme MboI and ligated to
the plasmid vector pBR322 which had been cut
with BamHI and treated with calf intestinal
alkaline phosphatase to minimise vector
self-religation. Phosphatase treatment was carried
out by incubating 5 micrograms BamHI-cut pBR322
plasmid with 0.5 microgram calf intestinal
phosphatase (Boehringer; in 10 mM Tris-HCl
buffer, pH 8.0) in a volume of 50 microlitres at
37°C for 10 minutes, see Huddleston & Brownlee
supra.
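
MboI fragments can be ligated into a BamHI-cut vector because both enzymes leave the same four-nucleotide 5' extension. A minimal sketch, assuming the standard recognition sites and cut positions for the two enzymes (MboI cutting before GATC, BamHI cutting between the two G residues of GGATCC); the dictionary and function names are illustrative.

```python
# Recognition site (5'->3') and cut offset on the top strand.
# MboI cuts before the first G (^GATC); BamHI cuts after the first G (G^GATCC).
ENZYMES = {"MboI": ("GATC", 0), "BamHI": ("GGATCC", 1)}

def five_prime_overhang(name):
    """Single-stranded 5' extension left by a palindromic cutter."""
    site, cut = ENZYMES[name]
    # The bottom strand is cut at the mirror-image position of the palindrome.
    return site[cut:len(site) - cut]

overhangs = {name: five_prime_overhang(name) for name in ENZYMES}
compatible = overhangs["MboI"] == overhangs["BamHI"]
print(overhangs, "compatible:", compatible)
```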

The ligated DNA was used to transform E. coli
strain MC1061. For transformation E. coli MC1061
was grown to early exponential phase, as
indicated by an absorbance of 0.2 at 600 nm, and
made "competent" by treating the pelleted
bacterial cells first with one half volume, followed
by repelleting, and then with 1/50 volume
(relative to the original growth medium) of
100 mM CaCl2, 15% v/v glycerol and 10 mM
PIPES-NaOH, pH 6.6 at 0°C.
Cells were immediately frozen in a dry ice/ethanol
bath to -70°C. For transformation, 200 microlitre
aliquots were mixed with 10 microlitres of the
recombinant DNA and incubated at 0°C for 10
minutes followed by 37°C for 5 minutes. 200
microlitres of L-broth (bactotryptone 10 g, yeast
extract 5 g, sodium chloride 10 g, made up to 1
litre with deionised water) were then added and
incubation continued for a further 30 minutes at
37°C. The solution was then plated on the
appropriate antibiotic agar (see below). A library of
about 7,000 ampicillin-resistant colonies was thus
obtained. They were ampicillin-resistant because
they contained the beta-lactamase gene of pBR322.
Of these, approx. 85% were found to be
tetracycline-sensitive.

(ii) dC/dG tailed library. In the preparation of this
library, first strand cDNA was synthesised as
described for the above library except that oligo
dT(12-18) was used as a primer to initiate cDNA
synthesis. Following this, the cDNA was tailed
with dCTP using terminal transferase and
back-copied with the aid of oligo dG(12-18) primer
and reverse transcriptase to give double stranded
DNA, exactly according to the method of Land et
al., Nucl. Acids Res. 9, 2251-2266, 1981. After a
further tailing with dCTP, this material was
annealed by hybridisation to a dGTP-tailed
pBR322 plasmid at the PstI site. The hybrid DNA
was used to transform E. coli strain MC1061. A
library of approximately 10,000 tetracycline-
resistant colonies was obtained. Of these,
approximately 80% were found to be sensitive to
ampicillin, due to insertion of DNA into the
ampicillin-resistance gene at the PstI site.
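
The two library sizes can be turned into rough counts of insert-bearing colonies. This is only a sketch: it assumes that loss of the second antibiotic resistance (tetracycline for the MboI library, ampicillin for the tailed library) marks a colony as recombinant, and it uses the approximate percentages quoted above.

```python
# (colonies obtained, fraction sensitive to the second antibiotic), from the text.
libraries = {
    "MboI": (7000, 0.85),
    "dC/dG-tailed": (10000, 0.80),
}

# Assumption: sensitivity to the second antibiotic indicates an inserted fragment.
estimated_recombinants = {
    name: round(colonies * fraction)
    for name, (colonies, fraction) in libraries.items()
}
print(estimated_recombinants)
```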

E. Isolation of specific bovine factor IX clones 
(i) From MboI library

The library of colonies, in an unordered fashion,
was transferred onto 13 Whatman 541 filter
papers and amplified with chloramphenicol, to
increase the number of copies of the plasmid in
the colonies, as described by Gergen et al., Nucl.
Acids Res. 7, 2115-2136 (1979). The filters
were pre-hybridised at 65°C for 4h in 6 x NET
(1 x NET = 0.15 M NaCl, 1 mM EDTA, 15 mM Tris-
HCl, pH 7.5), 5 x Denhardt's, 0.5% NP40 non-
ionic surfactant, and 1 microgram/ml yeast RNA
as described by Wallace et al., Nucl. Acids Res. 9,
879-894 (1981). Hybridisation was carried out
at 47°C for 20h in the same solution containing
3 x 10^[illegible] cpm (0.7 nanogram/ml) of labelled
oligo N-2 probe. Labelling was done by
phosphorylation of the oligonucleotides at the 5'
hydroxyl end using [gamma-32P]-ATP and T4
phosphokinase (Huddleston & Brownlee,
Nucl. Acids Res. 10, 1029-1038, 1981). At the
end of the hybridisation, filters were washed
successively at 0-4°C (2h), 25°C (10 min),
37°C (10 min) and 47°C (10 min). After
radioautography of the filters from this screening,
one colony showed a positive signal above
background. This colony was designated BIX-1
clone.
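
The working strength of the pre-hybridisation buffer follows directly from the 1 x NET definition given above. A minimal sketch of the scaling; the helper name `scale` is illustrative.

```python
# 1 x NET as defined in the text, in millimolar.
NET_1X_mM = {"NaCl": 150.0, "EDTA": 1.0, "Tris-HCl": 15.0}

def scale(recipe, factor):
    """Multiply every component of a buffer recipe by the same factor."""
    return {component: conc * factor for component, conc in recipe.items()}

net_6x = scale(NET_1X_mM, 6)
print(net_6x)  # component concentrations of 6 x NET in mM
```

So 6 x NET corresponds to 0.9 M NaCl, 6 mM EDTA and 90 mM Tris-HCl.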

(ii) From dC/dG-tailed library

Screening of this library, in an ordered array
fashion, using oligo N-2 probe as described
above resulted in the identification of a
positive clone. This was designated BIX-2 clone.

F. Sequence characterisation of bovine factor IX
cDNA clones

Characterisation of BIX-1 clone by restriction
endonuclease cleavage indicated that it contained
a DNA insert of about 430 base-pairs (data
omitted for brevity). Figure 5 shows part of the
nucleotide sequence of the coding strand,
determined by the Maxam-Gilbert method,
extending over 304 nucleotides, and provides
direct evidence that it has the identity of a bovine
factor IX sequence. Thus, nearly all of this 304
nucleotide sequence (corresponding to the amino
acid residues 52-139) agrees with the
nucleotide sequence predicted from the known
bovine factor IX amino acid sequence data
(Katayama et al., Proc. Natl. Acad. Sci. 76,
4990-4994, 1979). Over this region, there are
no discrepancies between BIX-1 and these
published data for factor IX, except at nucleotides
38-40 where the amino acid coded for is Asp
instead of Thr. This amino acid change was
similarly observed in a second, independent cDNA
clone (BIX-2; see below). The remainder of the
304-nucleotide sequence, i.e. that shown in
brackets in Figure 5, does not agree with the
published bovine factor IX amino acid data of
Katayama.

In Figure 5, the underlined portion denotes the
sequence corresponding to the oligo N-2 probe
sequence, the asterisk denotes a nonsense codon,
the brackets enclose a sequence which does not
correspond to Katayama's amino acid data and
the arrows indicate HinfI restriction sites. The
Katayama numbering system for amino acids is
shown and this sequence is in the opposite
orientation to the direction of transcription of the
tetracycline-resistance gene of the plasmid.

By similar methods, BIX-2 clone was found to
have a DNA insert of 102 nucleotides and this
spans the nucleotide positions 7-108 as shown
in Figure 5. The nucleotide sequences for BIX-1
and BIX-2 clones over this region (nucleotides
7-108) were identical.

G. Isolation of human factor IX gene 

(i) Initial clone: lambda HIX-1

A library of cloned human genomic DNA,
namely a HaeIII/AluI lambda phage Charon 4A
library prepared by Lawn et al., Cell 15,
1157-1174, 1978, was used. 10^6 phage
recombinants from this library were screened
using the in situ plaque hybridisation procedure as
described by T. Maniatis et al., Cell 15, 687,
1978. Pre-hybridisation and hybridisation were
carried out at 42°C in 50% formamide. After
hybridisation, filters were washed at room
temperature with 2 x SSC (1 x SSC = 0.15 M
NaCl, 15 mM sodium citrate, at pH 7.2) and 0.1%
SDS, then at 68°C with 1 x SSC and 0.1% SDS.

Two DNA molecules, being restriction
fragments from the factor IX cDNA cloned in
BIX-1, were radiolabelled and used as probes in
the hybridisation. The first fragment corresponds
to nucleotide numbers -8 to 317 on the
numbering system of Figure 5, and was isolated
by Sau3AI digestion of BIX-1 plasmid DNA. The
isolated DNA was labelled to high specific activity
by incorporation of [alpha-32P]-dATP using
nick translation (Rigby et al., J. Mol. Biol. 113,
237-251, 1977, modified, vide infra). Using this
probe, 10 clones were isolated. These were
plaque-purified and re-hybridised with a 247-
nucleotide fragment from BIX-1 clone. This
fragment, derived from nucleotides 3-249, can
be seen from Figure 5. It contains only sequences
in agreement with the Katayama bovine factor IX
amino acid sequence and was isolated by HinfI
digestion of BIX-1 plasmid DNA. Only a single
clone gave a positive hybridisation signal with this
247-nucleotide probe. This clone was further
plaque-purified and the resulting clone was
designated "lambda HIX-1".

(ii) Subsequent genomic clones 

A sub-clone, pATIXcVII, of recombinant human
factor IX cDNA from human liver mRNA, and
prepared as described in Section L below, was
linearised by digestion with HindIII and BamHI.
The resulting 2 kb cDNA molecule was purified by
1% agarose gel electrophoresis. After
electroelution, about 100 ng of this cDNA was
nick-translated with [alpha-32P]-dATP (see above)
and used as a hybridisation probe to screen the
HaeIII/AluI lambda phage Charon 4A human
genomic DNA library for further genomic clones,
using standard stringent hybridisation conditions.
Two further human factor IX genomic clones,
designated lambda HIX-2 and lambda HIX-3,
were thus obtained.

H. Characterisation of human factor IX genomic
clones

(i) Restriction map 

The initial lambda HIX-1 clone was
characterised by cleavage with various single and
double digests with different restriction
endonucleases and Southern blotting of fragments
using the bovine factor IX cDNA probe (results
omitted for brevity). The subsequently isolated
lambda HIX-2 and 3 clones were characterised
in the same way except that the human cDNA
probe, pATIXcVII (see Section L below), was used
for the Southern blots. From these results it
emerged that the sequences in the factor IX
genome corresponding to lambda HIX-2 and 3
overlapped with lambda HIX-1 as shown in
Figure 6(e). Section (d) of Figure 6 summarises
the results of the analysis using the
restriction enzymes EcoRI (E), HindIII (H), BglII (B),
BamHI (Ba) and PvuII (P), and this serves as a
restriction enzyme map.

(ii) Sequencing 

Numerous sub-clones were isolated from a
knowledge of the restriction enzyme map as
described in Section J(ii) below, the majority in a
vector pAT153/PvuII/8. Examples of these sub-
clones are shown in Figure 6(c) and a number
were used and were of a convenient length for
sequence analysis by the Maxam-Gilbert method
(Maxam & Gilbert, Proc. Natl. Acad. Sci. USA 74,
560-564, 1977).

Initially sequencing was done on part of a 1.4
kb EcoRI restriction fragment from the sub-clone
pHIX-17, see below and J(i). A 403-nucleotide
(base-pair) length was sequenced, of which a
129-nucleotide length was identified as lying
within an exon region. This is the 129-nucleotide
sequence used above to define the factor IX DNA.

Subsequently, a region of 11873 bases was
sequenced in the central portion of the gene [see
Figure 6(b)]. Figure 7 shows the sequence of one
strand of the DNA. The nucleotides are arbitrarily
numbered from 1 to 11873 in the 5' to 3'
direction. The original 403-nucleotide sequence
runs from Figure 7 nucleotides Nos. 4372 to 4774
and is indicated by O-O'. The 129-nucleotide
sequence lying within the 403 one runs from
Figure 7 nucleotides Nos. 4442 to 4570 and is
indicated by J-J'. This corresponds exactly to the
"w" exon.

In detail, the sequence of nucleotides Nos.
1-7830 contains two short exons (nucleotides
4442-4570 and 7140-7342 respectively)
marked w and x in Figure 6(a), J-J' and J'-J" in
Figures 7 and 9. These code for amino acids
85-127 and 128-195 respectively of the
amino acid sequence predicted from the human
factor IX cDNA clone (Figure 9). There are no
differences in amino acid sequences predicted
from the genomic and cDNA clones of the
invention in these two exon regions. The sequence
of the gene between residues 7831-11873 is
less complete, containing several gaps, but is still
a useful characterisation of the gene as it contains
two "AluI repeat" sequences, nucleotides
7960-8155 and 9671-9938. AluI sequences
are found in many genes. The repetition is not
exact but there is a typical degree of homology
between them. This further characterisation
provides a useful cross-check on the accuracy of
the restriction enzyme map. This emerges more
clearly from the restriction enzyme chart of Figure
8.
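
The coordinates quoted for Figure 7 can be checked against the stated lengths by counting both end positions. The sketch below uses only numbers given in the text; the remark about the "x" exon in the comments is one possible reading, not a statement from the specification.

```python
def span_length(first, last):
    """Number of positions in an inclusive, 1-based coordinate range."""
    return last - first + 1

regions = {
    "O-O' (initial sequence)": (4372, 4774),
    "exon w": (4442, 4570),
    "exon x": (7140, 7342),
    "repeat 1": (7960, 8155),
    "repeat 2": (9671, 9938),
}
lengths = {name: span_length(*coords) for name, coords in regions.items()}

# Exon w: whole codons, matching amino acids 85-127.
codons_w = divmod(lengths["exon w"], 3)
residues_w = span_length(85, 127)

# Exon x is not a multiple of three, which would be consistent with a codon
# shared with a neighbouring exon (an interpretation, not stated in the text).
codons_x = divmod(lengths["exon x"], 3)
residues_x = span_length(128, 195)

print(lengths, codons_w, residues_w, codons_x, residues_x)
```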

Figure 8 is a chart produced by a computer
analysis of the sequence data of the 11873
nucleotide long sequence of Figure 7. Column 1 of
Figure 8 gives the arbitrary nucleotide number
allotted to the nucleotide of Figure 7. Column 2
expresses the nucleotide number as a fraction of
the whole sequence. Column 3 shows the
restriction enzymes which will cut the DNA within
various short sequences of nucleotides shown in
Column 4. The short sequences of Column 4 begin
with the nucleotide numbered in Column 1. With
the aid of this chart the positions of the restriction
sites shown in Figure 6(d) and some of the
sequences shown in Figure 6(c) can be
determined very accurately. For example,
sequences II-IV are produced by restriction at
the following sites (denoted by the first nucleotide
number at the 5' end of each site).

II    3624-4769
III   6380-7378
IV    10589-11868

Particularly important sites are arrowed in Figure
8. Some of the relevant nucleotide numbers are
shown in Figure 6(c), the number given being that
of the nucleotide at the 5' end of each site.
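
A chart of the kind described for Figure 8 can be sketched as a scan of the sequence for recognition sites, reporting the four columns in the same order. This is an illustration under stated assumptions, not the program used for the patent: the toy sequence is invented, the recognition sites are the standard ones for the five enzymes named for Figure 6(d), and `site_chart` is a hypothetical name.

```python
import re

# Standard recognition sequences of the enzymes used for the Figure 6(d) map.
SITES = {"EcoRI": "GAATTC", "HindIII": "AAGCTT", "BamHI": "GGATCC",
         "BglII": "AGATCT", "PvuII": "CAGCTG"}

def site_chart(seq, sites=SITES):
    """Rows of (position, fraction of sequence, enzyme, site), 1-based."""
    rows = []
    for enzyme, motif in sites.items():
        # Lookahead so that overlapping occurrences are also reported.
        for match in re.finditer(f"(?={motif})", seq):
            pos = match.start() + 1
            rows.append((pos, round(pos / len(seq), 4), enzyme, motif))
    return sorted(rows)

toy = "TTGAATTCAAACAGCTGTTTAAGCTTGG"  # invented 28-nucleotide example
chart = site_chart(toy)
for row in chart:
    print(*row)

# Distances between the site positions quoted in the text for II-IV.
quoted = {"II": (3624, 4769), "III": (6380, 7378), "IV": (10589, 11868)}
spacing = {name: last - first for name, (first, last) in quoted.items()}
print(spacing)
```

The spacings come out at roughly 1.1, 1.0 and 1.3 kb, in line with the fragment sizes given for the corresponding sub-clones in Section J.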

Further sequence analysis of the sub-clones V,
VI, VII and VIII shown in Figure 6(c) indicates that
the factor IX gene is divided into at least 7 exon
regions separated by at least 6 introns. The
positions of the exons are shown in Figure 6(a) by
the solid blocks labelled t, u, v, w, x, y and z. The
"z" exon is much the longest and its 3'-end
coincides with the 3'-end of the mRNA. The
location of these exons relative to the cDNA
sequence is discussed below (Section L) and it is
clear that the "t" exon shown in Figure 6(a) is not
a marker for the 5'-end of the gene, as its
sequence fails to match that of the extreme 5'-end
of the cDNA clone (see below). This suggests that
the factor IX gene will be longer at its 5'-end than
the 27 kb region shown in Figure 6, and will
contain at least one further exon.

Additionally, pHIX-17 DNA was digested with
EcoRI. The digested material was resolved on
0.8% agarose gel and a 1.4 kb fragment was
isolated in solution by electroelution. It can be
stored in the usual manner. This 1.4 kb long
molecule was used for the initial sequencing. Only
about 1.0 kb is inserted DNA, the remaining 0.4 kb
being of pBR322. A 403 nucleotide length of the
inserted DNA was sequenced and is identified as
O-O' in Figure 7. The same 1.4 kb fragment was
also labelled and used as a probe in Section M.

I. Construction of a vector pAT153/PvuII/8

A derivative of the plasmid pAT153 (Twigg &
Sherratt, Nature 283, 216-218, 1980) was
prepared for subcloning of PvuII fragments of
factor IX genomic clones, and for ease of
characterisation of the resultant subclones. Two
partially complementary synthetic
deoxyoligonucleotides, oligo N3 and oligo N4,
were synthesised by the solid phase
phosphotriester method described in Section C
above. Each has "overhanging" BamHI and HindIII
recognition sequences and an internal PvuII
recognition sequence. Figure 10 shows the
structures of oligo N3 and oligo N4. BamHI and
HindIII cleave ds DNA to leave sticky or
"overhanging" ends. For example HindIII cleaves

    -AAGCTT
    -TTCGAA

between the adenine-carrying nucleotides of each
strand leaving the sticky-ended complementary
strands:

    -A
    -TTCGA

which are present in the oligo N3/N4 combination.
pAT153 was digested with HindIII and BamHI
and the 3393 nucleotide long linear fragment was
separated from the 346 nucleotide shorter
fragment by 0.7% agarose gel electrophoresis,
followed by electroelution of the appropriate
bands visualised by ethidium bromide
fluorescence under UV light. After treatment with
calf intestinal phosphatase, as described in
Section D(i), the BamHI-HindIII 3393-long
fragment was ligated to an equimolar mixture of
oligo N3 and oligo N4 which themselves had been
pretreated, as a mixture, with T4 polynucleotide
kinase and ATP, to phosphorylate their respective
5'-terminal OH groups. After transforming
competent MC1061 cells (see above) and plating
on L-broth plates containing 20 micrograms/ml
final concentration of ampicillin, 11 colonies were
selected for further analysis. A 1 ml plasmid
preparation (see Holmes and Quigley, Analytical
Biochem. 114, 193-197 (1981)) was made
from each of the 11 colonies. The plasmid DNA was
then analysed for its ability to be linearised by the
restriction enzymes BamHI, HindIII and PvuII. Four
clones were positive in this assay and one,
labelled pAT153/PvuII/8, was selected for
sequence analysis by the Maxam-Gilbert method
across the newly constructed section of the
plasmid. This part of the sequence is shown in
Figure 11 along with the unique restriction sites. The
novel part of the plasmid sequence is underlined:
the remainder is present in the parent plasmid
pAT153. The vector allows blunt-end cloning
(after treatment with phosphatase) into the
inserted PvuII site. The cloned DNA can be
excised, assuming that it lacks appropriate internal
restriction sites, with BamHI/HindIII, BamHI/ClaI or
BamHI/EcoRI double digests. The sites adjacent to
the PvuII site are also convenient for end labelling
with 32P for characterisation of the ends of cloned
DNA by the Maxam-Gilbert sequencing method.

J. Sub-cloning of human factor IX gene 

The following subcloning experiments were
carried out as a first step towards sequencing of
the factor IX gene, and to facilitate the isolation of
a small DNA fragment to be used as a probe for
the analysis of genomic DNA from haemophilia B
patients (see Section M).

(i) Sub-cloning into pBR322 plasmid

An approximately 11 kilobase BglII fragment
(see Figure 6) within the factor IX DNA insert in
lambda HIX-1 clone was inserted into the BamHI
site of pBR322. Transformation was carried out in
the E. coli strain HB101. The resulting "sub-
clone" was designated pHIX-17 (Figure 12).

(ii) Sub-cloning into pAT153/PvuII/8

(a) Plasmid DNA from pHIX-17 was prepared
and cleaved with PvuII. Five discrete fragments, all
derived from the DNA insert of pHIX-17, were
isolated. The sizes of these fragments were
approximately 2.3, 1.3, 1.2, 1.1 and 1.0 kilobases.
These fragments were blunt-end ligated into the
PvuII site of the pAT153/PvuII/8 vector and
transformed into E. coli HB101. Five clones of
recombinant DNA which carried the 2.3, 1.3, 1.2,
1.1 and 1.0 kb fragments were obtained and these
were designated pATIXPvu-1, 2, 3, 4 and 5
respectively. Factor IX DNA from pATIXPvu-2 is
abbreviated as IV and pATIXPvu-5 as III in Figure
6(c).

(b) Phage DNA from the lambda HIX-1 genomic
clone was digested with EcoRI. Three different
fragments (approximately 5, 2.3 and 0.96 kb; see
Figure 6), all derived from the insert into the
phage, were isolated and inserted in
pAT153/PvuII/8 vector at the EcoRI site and
cloned in E. coli HB101 to form sub-clones. The
three resulting clones, one for each of these
fragments, were designated pATIXEco-1, 2 and 4
respectively, which are shown in the restriction map
of Figure 6(d). pATIXEco-1 was further digested
with both EcoRI and BglII, and the "overhanging
ends" of the restriction sites filled in with
deoxynucleotide triphosphates using the Klenow
fragment of DNA polymerase I. After isolation of
the resulting 1.1 kb fragment by agarose gel
electrophoresis and electroelution, it was blunt-end
ligated using T4 DNA ligase into the PvuII site of
pAT153/PvuII/8 and allowed to transform E. coli
MC1061. The resultant sub-clone was designated
pATIXBE and the factor IX DNA sequence thereof is
abbreviated as II in Figure 6(c).

(c) Phage DNA from lambda HIX-2 was
digested with HindIII and EcoRI giving a 1.8 kb
and a 2.6 kb fragment amongst others. These
fragments were eluted separately, filled in as
described in (b) above, cloned as above into the
PvuII site of pAT153/PvuII/8 and allowed to
transform E. coli MC1061. The resultant clones
were designated pATIXHE-1, the factor IX
DNA sequence thereof being abbreviated as V in
Figure 6(c), and pATIXEco-6, the factor IX
DNA sequence thereof being abbreviated as VI in
Figure 6(c).

(d) Phage DNA from lambda HIX-3 was
digested with EcoRI and HindIII and the fragments
of 2.3 kb and 2.7 kb were sub-cloned exactly as
described in (c) above. The resultant clones were
designated pATIXEH-1, abbreviation VII in Figure
6(c), and pATIXHE-2, abbreviation VIII in Figure
6(c).

K. Preparation of a library of cDNA clones from
human liver mRNA

Messenger RNA was extracted from a human
liver and a 20-22 Svedberg unit enriched fraction
of mRNA prepared exactly as described for bovine
mRNA in Section B above, except that a
'translation assay' was not used. The first steps in
the construction of the double-stranded DNA were
carried out using the 'Stanford protocol' kindly
supplied from Professor P. Berg's department at
Stanford University, USA. This itself is a
modification of Wickens, Buell & Schimke
(J. Biol. Chem. 253, 2483-2495, 1978) and
some further modifications, incorporated in the
description given below, were made in the present
work.

For the first strand cDNA synthesis 6
micrograms of poly(A)+ 20-22S human mRNA
was incubated with 5 microlitres of 10x buffer
(0.5 M Tris-chloride, pH 8.5 at room temperature,
0.4 M KCl, 0.008 M MgCl2 and 4 mM
dithiothreitol), 20 microlitres of a 2.5 mM mixture
of each of the four deoxynucleoside triphosphates,
0.5 microlitres of oligo dT(12-18), 1 microlitre
(containing 0.5 microcurie) of [alpha-32P]-dATP, 2
microlitres of reverse transcriptase (14 units per
microlitre) and the volume made up to 50
microlitres with deionized water. After incubation
for 1 hour at 42°C, the solution was boiled for 1½
minutes and then rapidly cooled on ice. The
second strand synthesis was carried out by adding
directly to the above solution 20 microlitres of 5x
second strand buffer (250 mM Hepes/KOH pH 6.9,
250 mM KCl, 50 mM MgCl2), 4 microlitres of a
2.5 mM mixture of each of the four
deoxynucleoside triphosphates, 10 microlitres of
E. coli DNA polymerase I (6 units per microlitre)
and making the volume of the solution up to 100
microlitres with deionized water. After incubation
for 5 hours at 15°C, S1 nuclease digestion was
carried out by addition of 400 microlitres of S1
nuclease buffer (0.03 M sodium acetate pH 4.4,
0.25 M NaCl, 1 mM ZnSO4) and 1 microlitre of S1
nuclease (at 500 units per microlitre). After
incubating for 30 minutes at 37°C, 10 microlitres
of 0.5 M EDTA (pH 8.0) was added. Double
stranded DNA was deproteinised by shaking with
an equal volume of a phenol:chloroform (1:1)
mixture, followed by ether extraction of the
aqueous phase and precipitation of ds DNA by
addition of 2 volumes of ethanol. After 16 hours at
-20°C, ds DNA was recovered by centrifugation.
DNA polymerase I "fill in" of ends was carried
out by a further incubation of the sample dissolved
in 25 microlitres of 50 mM tris-chloride, pH 7.5,
10 mM MgCl2, 5 mM dithiothreitol and containing
0.02 mM dNTP and 6 units of DNA polymerase I.
After incubating for 10 minutes at room
temperature, 5 microlitres of EDTA (0.1 M at pH
7.4) and 3 microlitres of 5% sodium dodecyl
sulphate (SDS) were added.
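
The working concentrations in the first and second strand reactions follow from the stock strengths and volumes given above. A minimal sketch of the dilution arithmetic; the second strand figures deliberately ignore components carried over from the first strand mixture, which is a simplifying assumption, and the function name is illustrative.

```python
def final_conc(stock, volume_added_ul, total_volume_ul):
    """Final concentration after dilution (C1 x V1 = C2 x V2)."""
    return stock * volume_added_ul / total_volume_ul

# First strand: 5 microlitres of 10x buffer in a 50 microlitre reaction (mM).
buffer_10x_mM = {"Tris-chloride": 500.0, "KCl": 400.0,
                 "MgCl2": 8.0, "dithiothreitol": 4.0}
first_strand_mM = {name: final_conc(conc, 5, 50)
                   for name, conc in buffer_10x_mM.items()}
dntp_each_mM = final_conc(2.5, 20, 50)   # 20 microlitres of 2.5 mM stock
rt_units = 2 * 14                        # 2 microlitres at 14 units/microlitre

# Second strand: 20 microlitres of 5x buffer in a 100 microlitre reaction (mM),
# ignoring carry-over from the first strand mixture (simplifying assumption).
buffer_5x_mM = {"Hepes/KOH": 250.0, "KCl": 250.0, "MgCl2": 50.0}
second_strand_mM = {name: final_conc(conc, 20, 100)
                    for name, conc in buffer_5x_mM.items()}
pol1_units = 10 * 6                      # 10 microlitres at 6 units/microlitre

print(first_strand_mM, dntp_each_mM, rt_units)
print(second_strand_mM, pol1_units)
```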

The following part of the protocol differs from
the 'Stanford protocol'. The sample was
fractionated on a "mini"-Sephacryl S400 column
run in a disposable 1 ml pipette in 0.2 M NaCl, 10
mM tris-chloride, pH 7.5 and 1 mM EDTA. The
first 70% of the "break-through" peak of
radioactivity was pooled (0.4 ml) and
deproteinised by shaking with an equal volume of
n-butanol:chloroform (1:4). To the aqueous phase
was added 1 microgram of yeast RNA (BDH) as
carrier followed by 2 volumes of ethanol. After 16
hours at -20°C double stranded DNA was
recovered by centrifugation for blunt-end ligation
into calf intestinal phosphatase-treated PvuII-cut
pAT153/PvuII/8, using T4 DNA ligase (see I and
J(ii) above). After performing a trial experiment, it
was found that when the bulk of the sample was
incubated with 200 nanograms of vector DNA in a
suitable buffer (1 mM ATP, 50 mM Tris-chloride,
pH 7.4, 10 mM MgCl2 and 12 mM dithiothreitol)
and using 10 microlitres of T4 DNA ligase in a
total volume of 0.2 ml, then on subsequent
transformation of competent E. coli MC1061 cells
a total of 58,000 ampicillin-resistant colonies
were obtained. Up to 20% of these were
estimated to derive from "background"
non-recombinants derived by religation of the vector
itself. This 20-22S cDNA library was amplified
by growing the E. coli for a further 6 hours at 37°C.
1 ml aliquots of this amplified library were stored at
-20°C in L broth containing 15% glycerol, before
screening for factor IX cDNA clones.

L. Isolation and sequence analysis of human factor
IX cDNA clones

6000 colonies of the amplified 20-22S
human cDNA library were plated on each of ten
15 cm agar plates and after growing overnight were
blotted onto Whatman 541 filter paper. After
preparing filters for hybridisation as described in
Section E(i) above, the immobilised colonies were
probed with a 1.1 kb molecule of [alpha-32P]-nick
translated human factor IX genomic DNA isolated
from the pATIXBE subclone (Section J, above).
This linear 1.1 kb section of factor IX genomic
DNA was isolated from pATIXBE by cleavage
with the restriction enzymes BamHI and HindIII,
followed by separation of the 1.1 kb section from
the vector by 1.5% agarose gel electrophoresis. After
electroelution, nick-translation was carried out as
before and the material used in a hybridisation
reaction for 16 hours at 65°C in 3x SSC, 10x
Denhardt's solution, 0.1% SDS, 50
micrograms/ml sonicated denatured E. coli DNA
and 100 micrograms/ml of sonicated denatured
herring sperm DNA. After hybridisation filters were
washed at 65°C successively in 3x SSC, 0.1%
SDS (2 changes, half an hour each) and 2x SSC,
0.1% SDS (2 changes, half an hour each). After
radioautography, 7 clones were selected as
positive, but on dilution followed by re-screening
by hybridisation as above, only 5 proved to be
positive. Plasmid DNA was isolated from each of
these 5 clones and one, designated pATIXcVII,
was selected for sequence analysis as it appeared
to be the longest of the 5 clones as judged by its
electrophoretic mobility on 1% agarose gel
electrophoresis. A second shorter clone,
designated pATIXcVI, was also selected for partial
sequence analysis.

Sequencing was carried out by the Maxam- 
Gilbert method and a 2778 nucleotide long 
section of sequence is shown in Figure 9. 
Nucleotides 1 1 5 — 2002 were derived by 
sequencing clone pATIXcVII. (The actual extent of 
this clone is greater as it extends in a 5' direction 
to nucleotide 17. The sequence between 17 and 111 is inverted
with respect to the remainder of the sequence, presumably due
to a cloning artefact.) Nucleotides 1-130 were derived from
clone pATIXcVI, which extends from nucleotides 1-1548 of
Figure 9. The sequence from Nos. 2002-2778 was derived by
isolating 4 additional clones designated pATIX108.1,
pATIX108.2, pATIX108.3 and pATIXDB. The first 3 were derived
from a mini-library (designated GGB108) of the cDNA clones
constructed exactly as described in section K above except
that sucrose density gradient centrifugation was used instead
of chromatography on "Sephacryl" S-400 to fractionate the
double-stranded DNA according to size. A fraction of m.w. from
1 kb to 5 kb was selected and an amplified library of 10,000
independent clones containing approximately 20% background
non-recombinant clones was obtained. Clone pATIXDB was derived
from another cDNA library (designated DB1) constructed as
described in section K except that total poly A+ human liver
mRNA was used as the starting material and sucrose density
gradient centrifugation was used to fractionate the DNA
according to size as in the construction of the mini-library
GGB108. The complexity of this library was 95,000 with an
estimated background of non-recombinants of 50%. Clones
pATIX108.1 and pATIX108.2 were selected from a group of 30
hybridization-positive clones isolated by Grunstein-Hogness
screening of the mini-library GGB108 using a 32P-nick
translated probe derived from a Sau3AI restriction enzyme
fragment, itself derived from nucleotides 1796-2002 of clone
pATIXcVII. From pATIX108.1 the sequence of nucleotides
2009-2756 was determined (Figure 9). Following this the
sequence of a part of pATIX108.2, specifically nucleotides
1950-2086, provided the overlap with pATIXcVII. The remaining
28 hybridization-positive clones were screened by carrying out
a triple enzymatic digestion with the restriction enzymes
EcoRI, BamHI and HindIII and screening the product of the
digest for an EcoRI restriction fragment extending in the 3'
direction from the cut at position 2480. By this approach,
clone pATIX108.3 was selected and sequenced from nucleotides
2642-2778. In this clone the sequence was followed by three A
nucleotides, which were confirmed as a vestigial marker for
the poly A tail by the subsequent isolation of clone pATIXDB
by a similar method. pATIXDB was sequenced from Nos. 2760-2778
and ended in 42 A nucleotides, thus marking the 3' end of the
mRNA.
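
The way the overlapping clones tile the 3' portion of the cDNA can be checked with a short interval-merging sketch. The coordinates are those quoted in the paragraph above; the names `merge_intervals` and `SEQUENCED_REGIONS` are illustrative assumptions and form no part of the original disclosure.

```python
from typing import Dict, List, Tuple

Interval = Tuple[int, int]  # inclusive, 1-based nucleotide numbers of Figure 9


def merge_intervals(intervals: List[Interval]) -> List[Interval]:
    """Merge overlapping or directly adjacent inclusive intervals."""
    merged: List[Interval] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1] + 1:
            # Overlaps (or abuts) the previous block: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


# Regions actually sequenced from each additional clone, as quoted in the text.
SEQUENCED_REGIONS: Dict[str, Interval] = {
    "pATIX108.2": (1950, 2086),
    "pATIX108.1": (2009, 2756),
    "pATIX108.3": (2642, 2778),
    "pATIXDB": (2760, 2778),
}

contig = merge_intervals(list(SEQUENCED_REGIONS.values()))
print(contig)  # [(1950, 2778)]
```

A single merged block from 1950 to 2778 indicates that the four clones leave no gap, and its start lies inside the 1796-2002 region of pATIXcVII from which the screening probe was taken.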

Figure 9 shows that the predicted amino acid sequence is that
of a protein of 456 amino acids, but included in this are 41
residues of precursor amino acid sequence preceding the
N-terminal tyrosine residue (Y) of the definitive length
factor IX protein. The precursor section of the protein shows
a basic amino acid domain (amino acids -1 to -4) as well as
the more usual hydrophobic signal peptide domain (amino acids
-21 to -36).

The definitive factor IX protein consists of 415 amino acids
with 12 potential gamma-carboxyglutamic acid residues at amino
acids 7, 8, 15, 17, 20, 21, 26, 27, 30, 33, 36 and 40. Two
potential carbohydrate attachment sites occur at amino acid
residues 157 and 167. The activation peptide encompasses
residues 146-180, which are cut out in the activation of
factor IX (see Background of Invention) by the peptide
cleavage of an R-A and an R-V bond. This leaves a light chain
spanning residues 1-145 and a heavy chain spanning residues
181-415.
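
The residue numbering above can be cross-checked with a small bookkeeping sketch. All figures are taken from the text; the constant and function names (`REGIONS`, `region_of`, `region_lengths`) are hypothetical and chosen only for illustration.

```python
from typing import Dict, Tuple

PRECURSOR_LENGTH = 456  # residues predicted from Figure 9
LEADER_LENGTH = 41      # precursor residues preceding the N-terminal tyrosine

# Inclusive residue ranges of the definitive protein, as quoted in the text.
REGIONS: Dict[str, Tuple[int, int]] = {
    "light chain": (1, 145),
    "activation peptide": (146, 180),
    "heavy chain": (181, 415),
}
GLA_RESIDUES = [7, 8, 15, 17, 20, 21, 26, 27, 30, 33, 36, 40]
CARBOHYDRATE_SITES = [157, 167]


def region_of(residue: int) -> str:
    """Return the name of the region containing a residue number."""
    for name, (start, end) in REGIONS.items():
        if start <= residue <= end:
            return name
    raise ValueError(f"residue {residue} lies outside the definitive protein")


def region_lengths() -> Dict[str, int]:
    """Length of each region in residues (ranges are inclusive)."""
    return {name: end - start + 1 for name, (start, end) in REGIONS.items()}


mature_length = PRECURSOR_LENGTH - LEADER_LENGTH
print(mature_length)      # 415
print(region_lengths())   # {'light chain': 145, 'activation peptide': 35, 'heavy chain': 235}
print({region_of(r) for r in GLA_RESIDUES})        # {'light chain'}
print({region_of(r) for r in CARBOHYDRATE_SITES})  # {'activation peptide'}
```

The three regions add up to the 415 residues of the definitive protein, all 12 potential gamma-carboxyglutamic acid positions fall in the light chain, and both potential carbohydrate attachment sites fall in the activation peptide.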

The exact location of the boundaries between exons (see
Section H, above) and how they are joined in the mRNA is
marked in Figure 9. The exons are marked t, u, v, w, x, y, z.
It can be seen that there is a rough agreement between the
exon domains and the protein regions. For example, the exon
for the signal peptide is distinct from that of the GLA
region. Also, that of the activation peptide is separated from
the serine protease domain.

The 3' non-coding region of the mRNA is extensive, consisting
of 1390 residues (including the UAAUGA double terminator
1389-1394 but excluding the poly A tail).
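
The figure of 1390 residues follows from coordinates given in this section, assuming the last nucleotide before the poly A tail is No. 2778 (the end of the pATIXDB sequence). The sketch below also derives where the coding region would begin if, as a further assumption not stated in the text, the 456 codons run without interruption up to the terminator; the variable names are illustrative only.

```python
MRNA_3PRIME_END = 2778   # last nucleotide before the poly A tail (assumed from pATIXDB)
TERMINATOR_START = 1389  # first nucleotide of the UAAUGA double terminator
TERMINATOR = "UAAUGA"    # two consecutive stop codons, UAA and UGA
PRECURSOR_CODONS = 456   # amino acids in the predicted precursor
LEADER_CODONS = 41       # precursor residues preceding the N-terminal tyrosine

# Inclusive count of residues from the terminator to the 3' end.
non_coding_length = MRNA_3PRIME_END - TERMINATOR_START + 1
terminator_end = TERMINATOR_START + len(TERMINATOR) - 1

# Inferred positions, valid only under the uninterrupted-reading-frame assumption.
coding_start = TERMINATOR_START - 3 * PRECURSOR_CODONS
mature_start = coding_start + 3 * LEADER_CODONS

print(non_coding_length)  # 1390
print(terminator_end)     # 1394
print(coding_start)       # 21
print(mature_start)       # 144
```

Under these assumptions the sequence coding for the definitive protein would run from nucleotide 144 to nucleotide 1388.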

The factor IX cDNA is cleavable by the restriction enzyme
HaeIII to give a fragment from nucleotides 133-1440, i.e. a
1307 nucleotide long region of DNA entirely encompassing the
definitive factor IX protein sequence. The nucleotide sequence
recognised by HaeIII is GGCC. This fragment should be a
suitable starting material for the expression of factor IX
protein from suitable promoters in bacterial, yeast or
mammalian cells. Another suitable fragment could be derived
using the unique StuI site at residue 41 (corresponding to an
early part of the hydrophobic signal peptide region) and
linking it to a suitable promoter. The nucleotide sequence
recognised by StuI is AGGCCT.
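
Locating such recognition sequences in a sequence of interest is a simple string search, sketched below. The function `find_sites` and the 16-nucleotide `TOY_SEQUENCE` are hypothetical illustrations; the toy sequence is not taken from Figure 9.

```python
from typing import List

HAEIII_SITE = "GGCC"   # recognition sequence quoted in the text
STUI_SITE = "AGGCCT"   # recognition sequence quoted in the text


def find_sites(sequence: str, site: str) -> List[int]:
    """Return 1-based start positions of every occurrence of a site."""
    seq, site = sequence.upper(), site.upper()
    positions: List[int] = []
    index = seq.find(site)
    while index != -1:
        positions.append(index + 1)
        index = seq.find(site, index + 1)  # step by one so overlaps are found
    return positions


# Hypothetical toy sequence containing one StuI site and two HaeIII sites.
TOY_SEQUENCE = "TTAGGCCTAAGGCCAT"

print(find_sites(TOY_SEQUENCE, STUI_SITE))    # [3]
print(find_sites(TOY_SEQUENCE, HAEIII_SITE))  # [4, 11]

# Length of the HaeIII fragment quoted in the text.
print(1440 - 133)  # 1307
```

Because AGGCCT contains GGCC, every StuI site is also a HaeIII site, which the toy example shows at positions 3 and 4.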



M. Southern Analysis of normal and patient 
Christmas disease DNA 

(i) Normal

The standard (Southern) blotting procedure (Southern, J. Mol.
Biol. 98, 503-517, 1975) was used. In a typical experiment,
10-20 micrograms of human genomic DNA (prepared from
uncultured blood cells or cultured lymphocytic cells) were
digested with one of a number of restriction endonucleases and
loaded onto a single gel slot. Following electrophoresis on
0.8% agarose gel and transfer onto nitrocellulose it was
hybridised with 32P-labelled probe II or with the 32P-labelled
1.4 kb EcoRI fragment (see Section H).

Labelling of the probe was carried out by nick translation
using the method of Rigby et al., supra, modified as follows.
About 100 nanograms of the probe was mixed with 40 microcuries
of [alpha-32P] dATP (activity about 3,000 Curies/mMole,
obtained from Amersham International PLC) in 0.05M Tris-HCl,
pH 7.5, 0.01M MgCl2, 0.001M dithiothreitol and dCTP, dGTP,
dTTP each at a final concentration of 20 micromolar in a
volume of 29 microlitres. To this was added 1 microlitre of
"solution X" made up of a mixture of 6 units of DNA polymerase
I (E. coli), 0.6 nanograms of pancreatic DNase I
(Worthington), 1 microgram of crystalline BSA in 10
microlitres of 50% v/v glycerol containing 0.05M Tris-HCl, pH
7.5, 0.01M MgCl2 and 0.001M dithiothreitol. The mixture was
incubated for 2 hours at 15°C, after which high molecular
weight DNA was purified by chromatography on G-100 "Sephadex".
Figure 13 shows the major bands obtained with DNA from normal
individuals probed with either probe II (Figure 6) or labelled
1.4 kb EcoRI fragment. With each of the 4 enzymes used, EcoRI,
HindIII, BglII and BclI, a single major band of about 4.8,
5.2, 11 and 1.7 kb, respectively, was obtained.
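
The quantities in the labelling reaction can be worked through as follows. This is only an arithmetic sketch under the figures quoted above (40 microcuries of label at about 3,000 Curies/mMole, 29 + 1 microlitres, and 1 microlitre taken from 10 microlitres of "solution X"); the function and variable names are illustrative assumptions.

```python
def label_picomoles(microcuries: float, curies_per_mmol: float) -> float:
    """Amount of labelled nucleotide in picomoles."""
    millimoles = microcuries * 1e-6 / curies_per_mmol  # 1 microcurie = 1e-6 Curie
    return millimoles * 1e9                            # 1 mmol = 1e9 pmol


REACTION_VOLUME_UL = 29 + 1    # buffer mix plus 1 microlitre of "solution X"
SOLUTION_X_FRACTION = 1 / 10   # 1 microlitre taken from a 10 microlitre stock

label_pmol = label_picomoles(40, 3000)
# pmol per microlitre is numerically equal to micromolar.
label_micromolar = label_pmol / REACTION_VOLUME_UL

per_reaction = {
    "DNA polymerase I (units)": 6 * SOLUTION_X_FRACTION,
    "DNase I (ng)": 0.6 * SOLUTION_X_FRACTION,
    "BSA (micrograms)": 1 * SOLUTION_X_FRACTION,
}
final_glycerol_percent = 50 * 1 / REACTION_VOLUME_UL

print(round(label_pmol, 2))              # 13.33
print(round(label_micromolar, 3))        # 0.444
print(round(final_glycerol_percent, 2))  # 1.67
```

On these figures the labelled nucleotide is present at well under 1 micromolar, compared with 20 micromolar for each unlabelled nucleotide, and each reaction receives about 0.6 units of polymerase and 0.06 nanograms of DNase I.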

The fact that these restriction fragments had the same length
as those observed in the restriction map of clone lambda HIX-1
confirmed that the conditions of Southern blotting were
precise enough to detect the factor IX gene in total DNA
preparations. This provides the basis for analysis of DNA from
the blood of patients with Christmas disease.

(ii) Christmas patients with gene deletions 

The value of the probes of the invention for the assay of
alterations of genes of some patients suffering from Christmas
disease has been demonstrated as follows. Two patients with
severe Christmas disease, who also developed antibodies to
factor IX, were selected for study. The DNA from 50 ml of
blood was digested separately with EcoRI, HindIII, BglII and
BclI and Southern blots prepared for probing with 32P-nick
translated probe II (Figure 6). No specific bands were
observed with either patient under conditions where a control
digest gave the pattern shown in Figure 13. Similarly no bands
were observed in the patients when probe I, III or IV (Figure
6) was substituted for probe II. In order to control for the
possibility that some experimental artefact gave the observed
'negative' signal, a factor IX gene probe (this time
pATIXcVII, the cDNA probe) was mixed with an irrelevant
autosomal gene probe which was specific for the human AI
apolipoprotein (Shoulders and Baralle, Nucleic Acids Res. 10,
4873-4882, 1982). This experiment showed that patient 1 had a
normal AI apolipoprotein gene, characterised by a 12 kb band
in an EcoRI digest, and confirmed that he lacked the 5.5 kb
band observed with pATIXcVII and characteristic of the normal
factor IX gene. It was concluded that both patients have a
sequence of at least 18 kb deleted from their factor IX gene.
Two other patients, designated patients 3 and 4, who had also
developed antibodies to factor IX, gave bands in the normal or
abnormal positions on Southern blots with some factor IX gene
probes of the invention, but not with others. This suggested
that these patients had less extensive deletions of the gene,
possibly about 9 kb in length.
These results suggest that diagnosis of haemophiliacs and the
heterozygous (carrier) females would be possible in families
and this is now under examination. The altered pattern seen in
the patient's DNA, whether absence of a band or the presence
of a band in an abnormal position, serves as a "disease
marker", which can be used to assess for its presence or
absence in a suspected carrier. This same test can be applied
to antenatal diagnosis, if DNA from foetal cells is available
from an amniocentesis. "Genetic diagnosis" should considerably
improve existing methods of antenatal diagnosis based on the
assay of foetal factor IX protein levels, with the added
advantage that the test can be carried out earlier in
pregnancy. Genetic methods using natural polymorphisms within
the factor IX gene as allelic markers should also make 100%
carrier detection a reality, thereby improving the existing
somewhat unsatisfactory methods where probability values are
offered to patients.
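
The "disease marker" logic described above, that is, comparing a patient's band pattern with the normal pattern obtained with probe II, can be sketched as follows. The normal band sizes are those reported in section M(i); the function names and the 0.3 kb sizing tolerance are hypothetical assumptions, and the sketch covers only a male patient carrying a single factor IX gene, not the detection of heterozygous carriers.

```python
from typing import Dict, List

# Major band (kb) seen with probe II in normal DNA for each enzyme.
NORMAL_BANDS_KB: Dict[str, float] = {
    "EcoRI": 4.8,
    "HindIII": 5.2,
    "BglII": 11.0,
    "BclI": 1.7,
}


def classify_digest(enzyme: str, observed_kb: List[float],
                    tolerance_kb: float = 0.3) -> str:
    """Compare the bands seen in one digest with the normal band."""
    expected = NORMAL_BANDS_KB[enzyme]
    if not observed_kb:
        return "band absent"
    if all(abs(size - expected) <= tolerance_kb for size in observed_kb):
        return "normal"
    return "band in abnormal position"


def has_disease_marker(pattern: Dict[str, List[float]]) -> bool:
    """True if any digest departs from the normal pattern."""
    return any(classify_digest(enzyme, bands) != "normal"
               for enzyme, bands in pattern.items())


# Hypothetical example patterns, not data from the specification.
control = {"EcoRI": [4.8], "HindIII": [5.2], "BglII": [11.0], "BclI": [1.7]}
no_bands = {"EcoRI": [], "HindIII": [], "BglII": [], "BclI": []}

print(has_disease_marker(control))    # False
print(has_disease_marker(no_bands))   # True
print(classify_digest("BglII", [9.0]))  # band in abnormal position
```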

CLAIMS

1. Recombinant DNA which comprises a cloning vehicle DNA
sequence and a DNA sequence foreign to the cloning vehicle,
the foreign sequence comprising substantially the following
129-nucleotide sequence (read in rows of 30 across the page):

ATGTAACATG TAACATTAAG AATGGCAGAT
GCGAGCAGTT TTGTAAAAAT AGTGCTGATA
ACAAGGTGGT TTGCTCCTGT ACTGAGGGAT
ATCGACTTGC AGAAAACCAG AAGTCCTGTG
AACCAGCAG

2. Recombinant DNA which comprises a cloning vehicle DNA
sequence and a DNA sequence foreign to the cloning vehicle,
the foreign sequence comprising substantially the following
203-nucleotide sequence (read in rows of 30 across the page):

TGCCATTTCC ATGTGGAAGA GTTTCTGTTT
CACAAACTTC TAAGCTCACC CGTGCTGAGG
CTGTTTTTCC TGATGTGGAC TATGTAAATT
CTACTGAAGC TGAAACCATT TTGGATAACA
TCACTCAAAG CACCCAATCA TTTAATGACT
TCACTCGGGT TGTTGGTGGA GAAGATGCCA
AACCAGGTCA ATTCCCTTGG GAG

3. Recombinant DNA which comprises a cloning vehicle DNA
sequence and a sequence foreign to the cloning vehicle, the
foreign sequence being substantially the same as a sequence
occurring in the human factor IX genome.

4. Recombinant DNA according to Claim 3 wherein the human
factor IX sequence has a length of at least 50 nucleotides.

5. Recombinant DNA according to Claim 3 wherein the length of
the human factor IX sequence is from 75 to 27,000 nucleotides.

6. Recombinant DNA which comprises a cloning vehicle sequence
and a DNA sequence foreign to the cloning vehicle, wherein the
foreign sequence includes substantially the whole of an exon
sequence of the human factor IX genome.

7. Recombinant DNA which comprises a cloning vehicle sequence
and a DNA sequence foreign to the cloning vehicle, wherein the
foreign sequence comprises a DNA sequence which is
complementary to the human factor IX mRNA.

8. Recombinant DNA according to Claim 3, 4 or 5, wherein the
cloning vehicle is a modified pAT153 plasmid prepared by
ligating a BamHI and HindIII double digest of pAT153 to a pair
of complementary double sticky-ended oligonucleotides having a
DNA sequence providing a BamHI restriction residue at one end,
a HindIII restriction residue at the other end and a PvuII
restriction site in between.

9. Recombinant DNA according to Claim 8 wherein the pair of
complementary oligonucleotides are of formula:

5' GATCCAGCTGA 3'
3'     GTCGACTTCGA 5'

10. Recombinant DNA which comprises a cloning vehicle sequence
and a DNA sequence foreign thereto which hybridises to a 247
base-pair sequence of bovine factor IX DNA complementary to
messenger RNA and indicated in Figure 5 by the arrows at each
end thereof.

11. A host transformed with at least one molecule per cell of
recombinant DNA claimed in any preceding claim.

12. A host according to Claim 11 in the form of E. coli.

13. A host according to Claim 11 in the form of mammalian
tissue cells.

14. A process of preparing a host transformed with recombinant
DNA as claimed in any one of Claims 1 to 7, which process
comprises:

(1) synthesising an oligodeoxynucleotide probe having a
nucleotide sequence comprising that occurring in bovine factor
IX messenger RNA coding for amino acids 70-75 or 348-352 of
bovine factor IX and labelling the oligodeoxynucleotide to
form a probe;

(2) preparing complementary DNA to a mixture of bovine RNA;

(3) inserting the complementary DNA in a cloning vehicle to
form a mixture of recombinant bovine cDNAs;

(4) transforming a host with said mixture of recombinant
bovine cDNAs to form a library of clones and multiplying said
clones;

(5) probing the clones with the synthetic oligodeoxynucleotide
probe obtained in step 1 and isolating a resultant recombinant
bovine factor IX cDNA-containing clone;

(6) digesting the recombinant bovine factor IX cDNA from said
clone with one or more enzymes to produce a bovine factor IX
cDNA molecule containing a shorter sequence of bovine factor
IX DNA; and

(7) probing a library of recombinant human genomic DNA in a
transformed host with the shorter sequence bovine factor IX
cDNA molecule, to hybridise the human genomic DNA to the said
recombinant bovine factor IX DNA and isolating the resultant
recombinant DNA-transformed host.

15. A process of preparing a host transformed with recombinant
DNA as claimed in Claim 1, 2 or 7, which process comprises
probing a library of clones containing recombinant DNA
complementary to human mRNA with a probe comprising a labelled
DNA comprising a sequence complementary to part or all of an
exon region of the human factor IX genome.

16. A DNA molecule comprising an at least 15 nucleotide long
sequence of part or all of substantially the 129-nucleotide
sequence set forth in Claim 1.

17. A DNA molecule comprising an at least 15 nucleotide long
sequence of part or all of substantially the 203-nucleotide
sequence set forth in Claim 2.

18. A DNA molecule comprising an at least 15 nucleotide long
sequence of part only of the DNA sequence of the human factor
IX genome.

19. A DNA molecule comprising a sequence of length at least 15
nucleotides substantially the same as a sequence complementary
to part or all of that occurring in human factor IX mRNA.

20. A DNA molecule according to any one of Claims 16 to 19 of
length at least 50 nucleotides.

21. An artificial DNA molecule comprising a sequence
substantially the same as a sequence of length at least 15
nucleotides occurring in the human factor IX genome.

22. An artificial DNA molecule according to Claim 21
comprising substantially only exon sequences.

23. A labelled diagnostic probe comprising a DNA molecule
having a single-stranded or double-stranded probe sequence of
from 15 to 10,000 nucleotides long of DNA sequence defined in
Claim 16, 17, 18 or 19 or its complementary sequence.

24. A probe according to Claim 23 having a probe sequence from
20 to 5,000 nucleotides long.